Identification of Basic Phrases for Kazakh Language using Maximum Entropy Model

Authors

  • Gulila Altenbek
  • Xiaolong Wang
  • Gulizhada Haisha
Abstract

This paper proposes a definition, classification, and structure for Kazakh basic phrases, and sets up a framework for classifying them according to their syntactic functions. The structure of the Kazakh basic phrases was analyzed, followed by rule-based determination of their collocations and rule-based extraction of the phrases. A Maximum Entropy (ME) model is used to identify the phrases in text. Automatic identification based on the rule system with additional manual modification achieved an accuracy of 78.22%. The features of the ME model are designed from templates over Kazakh words, parts of speech, and affixes. Experimental results show that the accuracy reached 87.89%.
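The abstract names three feature sources (word, part of speech, affix) but does not list the actual templates. The following is only a minimal sketch of how such templates might be instantiated for one token before being passed to an ME classifier. The function name `extract_features`, the window size, the `<PAD>` symbol, and the use of the last three characters as a stand-in for an affix are all assumptions made for illustration, not the authors' design.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def extract_features(sent: List[Token], i: int, suffix_len: int = 3) -> Dict[str, str]:
    """Instantiate hypothetical word, POS, and affix templates for token i."""

    def word(j: int) -> str:
        # Positions outside the sentence map to a padding symbol (assumption).
        return sent[j][0] if 0 <= j < len(sent) else "<PAD>"

    def pos(j: int) -> str:
        return sent[j][1] if 0 <= j < len(sent) else "<PAD>"

    current = word(i)
    return {
        "w0": current,                       # current word
        "w-1": word(i - 1),                  # previous word
        "w+1": word(i + 1),                  # next word
        "p0": pos(i),                        # current POS tag
        "p-1": pos(i - 1),                   # previous POS tag
        "p+1": pos(i + 1),                   # next POS tag
        "p-1|p0": pos(i - 1) + "|" + pos(i),  # POS bigram
        "suf": current[-suffix_len:],        # crude affix proxy: last characters
    }


# Toy input in Latin transliteration; not taken from the paper's corpus.
toy_sentence = [("jaqsy", "ADJ"), ("kitaptar", "N")]
features = extract_features(toy_sentence, 1)
```

Each key/value pair would become one binary indicator feature in an ME model. A real system would presumably take affixes from a morphological analyzer rather than from a fixed-length character suffix.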

Similar resources

Extending the Hierarchical Phrase Based Model with Maximum Entropy Based BTG

In the hierarchical phrase based (HPB) translation model, in addition to hierarchical phrase pairs extracted from bi-text, glue rules are used to perform serial combination of phrases. However, this basic method for combining phrases is not sufficient for phrase reordering. In this paper, we extend the HPB model with maximum entropy based bracketing transduction grammar (BTG), which provides co...

Maximum likelihood and discriminative training of direct translation models

We consider translating natural language sentences into a formal language using direct translation models built automatically from training data. Direct translation models have three components: an arbitrary prior conditional probability distribution, features that capture correlations between automatically determined key phrases or sets of words in both languages, and weights associated with t...

Spatial Simulation and Land-subsidence Susceptibility Mapping Using Maximum Entropy Model

The aim of this research is spatial simulation and land-subsidence susceptibility mapping using the maximum entropy model in Jiroft and Anbarabad Townships. First, land subsidence locations were identified through extensive field surveys, and the land subsidence distribution map was then produced in a geographic information system. Then, each of effective factors on land subsidence occurred i...

Automatic identification of command boundaries in a conversational natural language user interface

In this paper, we propose a trainable system that can automatically identify the command boundaries in a conversational natural language user interface. The proposed solution makes the conversational interface much more user-friendly, and allows the user to speak naturally and continuously in a hands-free manner. The main ingredient of the system is the maximum entropy identification model, whic...

Comparison of Spectral Methods for Spoken Language Identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...

Publication date: 2014